Discovery of andrographolide hit analog as a potent cyclooxygenase-2 inhibitor through consensus MD-simulation, electrostatic potential energy simulation and ligand efficiency metrics

Cyclooxygenase-2 (COX-2) is the key enzyme responsible for the conversion of arachidonic acid to prostaglandins that display pro-inflammatory properties and thus, it is a potential target protein to develop anti-inflammatory drugs. In this study, chemical and bio-informatics approaches have been employed to find a novel potent andrographolide (AGP) analog as a COX-2 inhibitor having better pharmacological properties than aspirin and rofecoxib (controls). The full amino acid sequenced human Alpha fold (AF) COX-2 protein (604AA) was selected and validated for its accuracy against the reported COX-2 protein structures (PDB ID: 5F19, 5KIR, 5F1A, 5IKQ and 1V0X) followed by multiple sequence alignment analysis to establish the sequence conservation. The systematic virtual screening of 237 AGP analogs against AF-COX-2 protein yielded 22 lead compounds based on the binding energy score (< − 8.0 kcal/mol). These were further screened out to 7 analogs by molecular docking analysis and investigated further for ADMET prediction, ligand efficiency metrics calculations, quantum mechanical analysis, MD simulation, electrostatic potential energy (EPE) docking simulation, and MM/GBSA. In-depth analysis revealed that AGP analog A3 (3-[2-[(1R,4aR,5R,6R,8aR)-6-hydroxy-5,6,8a-trimethyl-2-methylidene-3,4,4a,5,7,8-hexahydro-1H-naphthalen-1-yl]ethylidene]-4-hydroxyoxolan-2-one) forms the most stable complex with the AF-COX-2 showing the least RMSD value (0.37 ± 0.03 nm), a good number of hydrogen bonds (protein–ligand H-bond = 11, and protein H-bond = 525), minimum EPE score (− 53.81 kcal/mol), and lowest MM-GBSA before and after simulation (− 55.37 and − 56.25 kcal/mol, respectively) value compared to other analogs and controls. Thus, we suggest that the identified A3 AGP analog could be developed as a promising plant-based anti-inflammatory drug by inhibiting COX-2.

www.nature.com/scientificreports/ Multiple sequence alignment. To verify the presence of the conserved amino acids in the AF-COX-2 protein, multiple sequence alignment (MSA) was performed using the PRALINE program 46 . For the same, 10 different mammal species namely Homo sapiens, Rattus norvegicus, Mus musculus, Ovis aries, Bos Taurus, Oryctolagus cuniculus, Cavia porcellus, Equus caballus, Neovison vison, and Gallus gallus were selected and amino acid sequences of their respective COX-2 protein were retrieved from Uniprot (http:// www. unipr ot. org/). The resulting aligned amino acid sequences were colored based on the conservation index (0 to10), wherein the highly conserved sequences of amino acid amongst species are responsible for the particular biological function of the protein 47 .
Site-directed virtual screening and molecular docking analysis. A total of 237 AGP analogs were taken from the PubChem database that consists of the common α,β-unsaturated γ-lactone moiety which is responsible for the biological activity by forming adduct with the residues during the biological interaction 25,26 . Then, the analogs were converted into a autodock PDBQT format for virtual screening by Open Babel 48,49 . PyRx (AutoDock Vina) software was used for the site-directed virtual screening of the AGP analogs against the AF-COX-2 protein 50 . Further, based on the interacting amino acid residues present for the co-crystal aspirin and rofecoxib in the 5F19 and 5KIR PDB structure respectively, amino acids namely His75, Arg106, Val330, Tyr334, Val335, Tyr341, Tyr371, Trp373, Arg499, Phe504, Glu510, Ser516, and Leu517 were chosen as active site residues in the AF-COX-2 protein for setting up the grid for the virtual screening 35,36 . The grid box was set with search space center 4.76, 5.07, and − 0.12 at x, y, and z, respectively, with 8 exhaustiveness, while the dimensions (Å) were set for the grid at x, y, and z with a value of 29.54, 29.74, and 27.37 each. For shortlisting the hit AGP analogs, the cutoff value for binding energy (BE) was set to be lesser than − 8.00 kcal/mol. The hit AGP analogs obtained from virtual screening were submitted for molecular docking analysis along with AGP, aspirin, and rofecoxib as controls. AutoDock4.2 was utilized for the site-directed docking of these compounds with AF-COX-2 51 . Since the AF protein structure of COX-2 is a predicted protein structure, cocrystal ligands, i.e. aspirin and rofecoxib of the existing crystallized structure of COX-2 protein 5F19 and 5KIR, respectively, were taken as reference compounds for the docking analysis. Further, amino acids i.e. His75, Arg106, Val330, Tyr334, Val335, Tyr341, Tyr371, Trp373, Arg499, Phe504, Glu510, Ser516, and Leu517 were chosen as the binding site residues based on the aspirin and rofecoxib interacting residues in the 5F19 and 5KIR PDB structures. The gird box for docking was generated with 0.375 Å spacing and dimension along the x, y, and z coordinates with the values of 84, 88, and 94, respectively. The grid center was set to the centroid of the binding site residues with the dimension of 4.305, 4.394, and 1.611, respectively, alongside the x, y, and z-Axis. Further, GA (genetic algorithm) was employed to calculate the docking parameters with 25000000 number of energy evaluations, 27000 number of generations, and 2.0 Å RMSC (root mean square cluster) tolerance. Lamarckian genetic algorithm was used to perform docking simulation. The AutoGrid and AutoDock were utilized to perform molecular docking and to investigate the best binding pose of the ligands in protein-ligand complexes. Finally, the ligand's conformers with the lowest free binding energy were selected for further analysis. The output files obtained from the docking study were visualized and analyzed by ICM-Browser and Accelrys discovery studio visualizer 52 . ADMET prediction. ADMETlab 2.0 was used to predict physicochemical, medicinal, absorption, distribution, metabolism, excretion, and toxicological parameters for AGP and shortlisted AGP analogs along with aspirin and rofecoxib 53 . It was also used to establish the drug-likeness of the AGP and shortlisted analogs.
Ligand efficiency metrics. Ligand binding efficiency metrics were computed in terms of inhibition constant (Ki), ligand efficiency (LE), ligand lipophilic efficiency (LLE), ligand efficiency scaling function (LE_Scale), fit quality (FQ), and lipophilicity corrected ligand efficiency (LELP) by following the Eq. (1-6) 54-58 . Quantum-mechanical descriptors calculation. Spartan 20 (wavefun.com) package was used to calculate quantum-mechanical (QM) descriptors in terms of the highest occupied molecular orbital (HOMO), lowest unoccupied molecular orbital (LUMO), HOMO-LUMO gap (HLG), and EPE parameters for assessing the molecular orbital properties, electrostatic potential, and elucidating various types of interactions. All the chemical compounds were optimized to compute equilibrium geometric variables in the gaseous state at the ground employing density functional theory (DFT) with Becke's three-parameter exchange potential 59 . Further,

Electrostatic potential energy simulation of peptide-ligand adduct
To assess the stability of adducts of AGP, hit analogs, aspirin, and rofecoxib with the selected peptide length of AF-COX-2, the EPE docking simulation study was carried out using Spartan 20. For the same, amino acid sequence length from 511 to 520 of AF-COX-2 (i.e. Val511, Gly512, Ala513, Pro514, Phe515, Ser516, Leu517, Lys518, Gly519, and Leu520) was selected as a peptide for forming the adducts with the compounds. For the same, first a model consisting of 10 amino acid residue peptide and one ligand was built. This was followed by energy minimization with the molecular mechanics option followed by the B3LYP/6-31G* single-point calculations. The energy of adduct was obtained directly from the output of the computations. Further, to calculate the electronic parameters, the data related to B3LYP/6-31G* basis set was taken as the number of unpaired electrons (multiplicity) = 0, total charge = neutral coupling, and coupling constant = empirical and stabilization energy was determined by deducing the sum of the energy of the individual constituents from the adduct energy. Then, EPE was calculated for each adduct and peptide which characterized the electron distribution on the surface of compounds 66 . Finally, the binding energy of complex compounds with peptide was calculated using the following Eq. (7). MM/GBSA rescoring study. The binding free energies in terms of molecular mechanics generalized born surface area (MM/GBSA) of AGP's hit analogs, AGP, aspirin, and rofecoxib complexes were calculated by Fast Amber Rescoring (FAR) 67 . The ff14SB force field was used for the small molecules, whereas the Generalized Amber Force Field2 (GAFF2) was employed for protein analysis 68 . Further, the AM1-BCC method was used to assign the partial charge in the analogs through an antechamber module of Amber 69,70 .

Results and discussion
Selection, validation, and binding site prediction of protein. The full sequenced COX-2 protein structure AF-P35354-F1, comprising 1-604 amino acids, was used in this study, which is modeled by AlphaFold and uses a state-of-the-Art machine learning method for protein modeling. Before performing any computational analysis, this protein structure was validated by comparing the overall quality of the protein with the existing four crystallized COX-2 protein structures (PDB ID: 5F19, 5KIR, 5F1A, and 5IKQ) and a modeled structure (PDB ID: 1V0X). Supplementary Table S1 summarizes the results obtained from SAVES, ProSA, and ProQ for all the protein structures, which use ERRAT, VERIFY, PROVE, Whatcheck, Procheck, Z score, and LGscore to assess the quality of the protein by calculating the overall quality factor, compatibility of an atomic 3D model with its amino acid sequence, volumes of atoms, stereochemical parameters, and stereochemical quality. The AF-COX-2 protein structure successfully passed the VERIFY and PROVE with 97.74% overall quality factor with 90.9% residues in the most favored reason. In addition, z-score and LGscore values for the AF-COX-2 protein were found to be -8.97, and 7.639, respectively which indicates that AF-COX-2 model protein structure is of good quality 40,42 . Further, the overall quality of the AF-COX-2 protein structure was found to be the best among all other structures in terms of quality, compatibility, and stereochemistry parameters, and hence this 3D protein structure was used for further studies.
Multiple sequence alignment to predict active binding site. The AF-COX-2 protein has a modeled 3D structure with no co-crystallized ligand, therefore, binding site residue prediction in the refined protein structure was carried out by comparing it with the active site of the existing crystallized PDB structure of COX-2 present in the PDB database. For the same, 5F19 and 5KIR PDB structures of COX-2 were selected which have aspirin and rofecoxib as co-crystal ligands, respectively. Based on the binding sites of both the co-crystal ligands in the protein structure, His75, Arg106, Val330, Tyr334, Tyr341, Tyr371, Trp373, Arg499, Phe504, Glu510, Ser516, and Leu517 amino acids were selected as the active sites for binding interaction studies. To verify these residues as the prominent active site, multiple sequence alignment was performed using the COX-2 protein of 10 different mammalian species, which proved that except for Arg499, all the selected amino acids are fully conserved (Supplementary Fig. S1 and Table S3). This also signifies the importance of these amino acids in various biological functions and as a potential target for drug development 47 .
Site-directed virtual screening and molecular docking analysis. A set of 237 andrographolide analogs that consists of the common α,β-unsaturated γ-lactone moiety which is important for biological activity were screened against the full sequence validated AF-COX-2 protein to find out a potent plant based antiinflammatory compound 25,26 . The site directed virtual screening was utilized to discover the hit AGP analogs by computational approach. It calculates the binding energy for all the subjected analogs by the results of proteinligand interaction. Subsequently, the energetically more favorable analogs' complexes were selected for further analysis. The results from the virtual screening analysis revealed that the lowest binding energy was found to be − 8.80 kcal/mol for PubChem ID: 132210370, while the highest binding energy was determined to be − 5.90 kcal/mol for PubChem ID:59876522 (Supplementary Table S4). Thereafter, based on the binding energy cutoff value (− 8.00 kcal/mol), 22 AGP analogs out of 237 were screened for further analysis. Therefore, these 22 AGP analogs were subjected to molecular docking studies to find out the potent COX-2 inhibitor along with AGP, and control drugs (aspirin and rofecoxib). The binding energy values obtained from the molecular docking analysis were found to be in the range of − 9.35 kcal/mol (PubChem ID: 132210508) to − 3.2 kcal/mol (PubChem ID: 132217538) (Supplementary Table S5). In contrast, aspirin, rofecoxib, and AGP showed binding energy values of − 5.61 kcal/mol, − 8.17 kcal/ mol, and − 7.95 kcal/mol, respectively. Out of 22 AGP analogs, 7 hit analogs, named A1-A7, were selected for further analysis based on the binding energy cutoff value of lesser than − 8.00 kcal/mol. In terms of binding energy, all seven hit analogs showed lesser value compared to aspirin and AGP, while five analogs (A1-A5) exhibited lesser value than rofecoxib (Table 1), indicating their potential towards COX-2 protein inhibition.
Comparative molecular interactions and binding pose analysis. The binding pattern of AGP and its selected seven hit analogs with respect to the existing COX-2 protein inhibitors, i.e., aspirin and rofecoxib (controls), was examined to compare the molecular binding interaction and binding pose analysis ( Table 2). This analysis reveals that Val335, Leu338, Ser339, Phe504, Val509, and Ala513 amino acids are common in the interactions of all the subjected compounds with AF-COX-2 protein. In addition, Trp373, Gly512, and Ser516 amino acids are shared by aspirin, AGP, and A1-A7 analogs, whereas Leu78, Val102, Arg106, Tyr341, Leu345, Leu517 residues are occupied by rofecoxib, AGP, and A1-A7. Further, among all the interacting amino acids, Arg106, Tyr371, Arg499, Met508, Val509, Glu510, Ala513, and Ser516 residues were involved in H-bond interaction, while Leu78, Val102, Arg106, Val335, Leu338, Tyr341, Leu345, Phe504, Met508, Val509, Ala513, Leu517 residues were engaged in the HYD interaction. AGP and hit analogs A1-A5 showed 3, 3, 4, 1, 3, and 4 H-bonds, respectively, while no H-bond was formed in the case of A6 and A7 analog. Intriguingly, it is evident from the binding pattern that AGP and A1-A7 analogs share the interacting amino acid residues of both aspirin and rofecoxib binding site of AF-COX-2 protein (Figs. 1 and 2), indicating their potential as COX-2 protein inhibitor with improved binding interaction. Further, the binding orientation of AGP and hit analogs display that oxalone moiety is faced towards the aspirin binding site, while the naphthalene/naphthol moiety is positioned towards the rofecoxib binding site ( Supplementary Fig. S2).
ADMET prediction. The physicochemical, absorption, distribution, metabolism, excretion, and toxicological properties of AGP and A1-A7 analogs were predicted and compared with aspirin and rofecoxib to evaluate their drug-likeness (Table 3). Physicochemical properties such as molecular weight (MW), number of H-bond acceptors (nHA), number of H-bond donors (nHD), topological polar surface area (TPSA), and the logarithm of the n-octanol/water distribution coefficient (logP) were calculated for all the selected compounds. Further, medicinal chemistry properties, i.e. natural product-likeness score (NPscore), Lipinski rule, and Pfizer rule, were analyzed for all the compounds. Except for rofecoxib and A5 analog, all the compounds fulfilled the criteria for physicochemical and medicinal chemistry properties for drug-like candidates. Furthermore, the absorption properties such as Caco-2 (the human colon adenocarcinoma cell lines) permeability, Pgp-inhibitory activity (inhibitor of P-glycoprotein), human intestinal absorption (HIA range 0-0.3), and F 30% (the human oral bioavailability 30%) were also computed for all the compounds. Except for rofecoxib and A6 analog, all the compounds fulfilled the criteria to be used as orally active drugs. In addition, distribution properties for example    www.nature.com/scientificreports/ plasma protein binding (PPB), the volume of distribution (VD), blood-brain barrier (BBB) penetration, and fraction unbound in plasma (Fu) were predicted for all the studied molecules. The distribution properties of AGP and A1-A7 analogs were found to be in the acceptable range for a drug-like candidate. Moreover, the cumulative analysis of metabolism and excretion properties also favored the AGP and A1-A7 analog as potential drug candidates. Further, the toxicological properties assessment also predicted the non-toxic nature of the A1-A7 AGP analogs.
Ligand efficiency metrics. The ligand binding quality of the selected was calculated based on the molecular mass and lipophilicity with the context of the particular protein target. Typically, a low Ki value refers to high potency and it should be in the micromolar range for a molecule to be qualified as a hit or lead compound 55 . Ligand efficiency (threshold value = 0.3 kcal/mol/HA) gives an idea about how well a compound binds for its size with the protein 56 . Ligand lipophilic efficiency (threshold value = 3) is used to measure the affinity of a molecule towards lipophilicity which is the major factor for the promiscuity of the selective LLE and optimized  www.nature.com/scientificreports/ www.nature.com/scientificreports/ compounds 54,56,57 . LE_Scale is a size-independent scaling function of ligand efficiency and is derived by fitting an exponential function to the maximal ligand efficiency values observed for a given heavy atom (HA). Fit quality (threshold value = 0.8), also called as scaled ligand efficiency, includes ligand efficiency and size into a single metric and results by dividing the observed LE with LE_scale 55,56,58 . Lastly, LELP (acceptable range = -10 to + 10) was calculated which presents a metric that indicates the price of ligand efficiency paid in lipophilicity 54,55 . Further, binding energy (BE) obtained from the docking studies (Table 1), HA and LogP acquired from physicochemical properties prediction (Table 3) were taken for the calculation of all the LE metrics. It can be observed from Table 4 that except for A5 and controls (aspirin and rofecoxib), all the screened analogs show the respective ligand efficiency metrics values which surpass the minimum thresholds and to be called as HIT analogs. Also, in terms of inhibition constant, analogs A1-A5 displayed a Ki value smaller than the controls (aspirin Ki = 75.604 µM and rofecoxib Ki = 0.995 µM) and AGP (Ki = 1.440 µM) indicating their greater binding affinity towards the inhibition of COX-2 protein than controls and AGP.
Interpretation of quantum mechanical descriptors. The frontier molecular orbitals (FMOs) such as HOMO, LUMO, and HLG were calculated to anticipate the electronic properties of the AGP, hit analogs, aspirin, and rofecoxib against AF-COX-2. In the case of AGP and A1-A7 analogs, HOMO and LUMO orbitals locate on the naphthalene/naphthol and oxalone moiety, respectively, whereas, in the case of aspirin, HOMO and LUMO orbitals localize on the benzoic acid part of the structure. In contrast, HOMO orbitals confine to the phenylfuran moiety while the LUMO orbitals position on the methylsulfonylpheny-furan moiety of rofecoxib (Fig. 3). The HLG between two energetic states, i.e. HOMO and LUMO, explains the chemical reactivity of molecules where the lowest HLG value infers greater chemical reactivity and biological activity, and lesser stability of the compound by making the flow of electrons easier 72 . It also plays an important role in protein-ligand complex stabilization 73  Electrostatic potential energy map of the chemical compounds. The EPE map is used to predict the active sites for intermolecular interactions as a spatial electron density distribution over the chemical compounds 74,75 . Figure 4 shows the EPE map of aspirin, rofecoxib, AGP, and A1-A7 in which the blue color region represents the most positive potential that is responsible for nucleophilic interaction while the red color indicates the most negative potential that is the site for electrophilic interaction. It is evident from the EPE map analysis that acetyloxy and carboxyl groups in aspirin, methylsulfonyl and furan groups in rofecoxib, and oxalone and hydroxyl groups of naphthalene moiety in AGP and hit analogs are in the negative potential (red region) and thus these are the prominent site for the electrophilic interactions. In contrast, the positive potential (blue region), which is a potential site for nucleophilic interactions, lies around the hydrogen of the carboxyl group in aspirin, phenyl group of rofecoxib, and naphthalene or naphthol moiety in AGP and A1-A7 analogs. The positive EPE (kcal/mol) were found to be in the order of rofecoxib ( (Fig. 4). As the negative value of EPE favors the stronger H-bond formation 76,77 , the AGP and its A1-A7 hit analogs showed better potential towards stronger H-bond interaction compared to the aspirin and rofecoxib.

Structure activity relationship (SAR) of the screened AGP analogs.
Structurally, all the screened AGP analogs share the main chemical scaffold which consists of two cyclic groups (i.e. naphthalene/naphthol and oxalone) linked with ethyledine moiety. Bicyclic groups containing compounds manifest a myriad of bio-  www.nature.com/scientificreports/ logical activities 78 . These bicyclic groups containing drugs predominantly interact with proteins through hydrophobic, and aromatic residues. Further, it has also been demonstrated that the bicyclic compounds interact with the polar groups such as hydroxyl, and amides of the amino acid 79 . In this study, amongst the screened AGP analogs, there are two stereoisomers A1 (6S)/A3 (6R), and A2 (6S)/A4 (6R), whereas, analogs A5, A6, and A7 are distinct from each other. Compared to AGP, analogs A1 to A5 structurally differ at 5 and 6 positions of naphthalene moiety, while the rest of the chemical scaffold is the same. In place of naphthalene, analogs A6 and A7 possessed naphthol moiety and has a different group at the 3rd position of the naphthol ring. Further, molecular  www.nature.com/scientificreports/ interaction analysis revealed that amongst screened analogs naphthalene moiety-containing compounds showed more affinity compared to naphthol towards the COX-2 protein in terms of binding energy, H-bonds and HYD interactions (Table 1 and 2). It is evident from Fig. 2 and Table 2 of the molecular interaction analysis that the oxolane moiety is mainly involved in the H-bond interaction while naphthalene/naphthol moiety mainly concerned with the HYD interactions. This was further confirmed by quantum mechanical descriptor analysis. Structurally, compared to AGP, analog A1-A4 has one less -OH group at the 5 th position in the naphthalene moiety which increases the duration of action of these analogs which has been proved by ADMET prediction (Table 3). This further increases the HYD interaction of these types of analogs with the COX-2 protein and also increases the affinity with the receptor 80 . Therefore, based on the above relationship between the chemical structure features and the results obtained from the molecular docking and DFT analysis, we concluded that screened AGP analogs shared improved anti-inflammatory activity compared to AGP and the reference drugs (aspirin and rofecoxib).
MD Simulation. MD simulation analysis was carried out for AGP and hit analogs' complexes with AF-COX-2 protein along with appropriate controls {AF-COX-2 protein alone (neutral control) and its complex with aspirin and rofecoxib (positive control)}. The MD simulation trajectory analysis of AGP and hit analogs' complex was compared with native AF-COX-2 protein, aspirin, and rofecoxib complex in terms of RMSD, RMSF, residual area, Rg, SAS area, SAS volume, SAS density, protein H-bonds, and protein-ligand complex H-bonds. The RMSD measures the protein conformation by calculating the average distance between the backbone atoms of a protein during the simulation and assessing the structural stability. The average RMSD values for the AGP and hit analogs were found to be lower than the neutral control (native protein). In comparison with positive controls, the average RMSD value was observed to be lesser in analog A2, A3, and A4. The least value RMSD value was observed for the analog A3 indicating its ability to form a stable adduct with AF-COX-2 protein ( Fig. 5a and Supplementary Table S6). The RMSF is calculated for the residual mobility of each protein-ligand complex as well as the native protein. No noticeable difference was found between the residual fluctuation profiles of all the complexes (Fig. 5b). This may be ascribed to the formation of H-bonds and van der Waal interactions between amino acid residues and ligands. As shown in Fig. 5b, the active binding site regions have lower fluctuations indicating their stability in the AF-COX-2 protein. Further, the average RMSF was observed to be lower in the A1, A3, A6, and A7 complexes than in the neutral control as well as the positive control (Supplementary Table S6). The individual RMSF for the interacting amino acid residues in each simulation complex system was computed and compared with the respective average RMSF value for the simulation system. Further, from the Supplementary Table S6, the average RMSF range for all the studied simulation systems was found to be 0.16 ± 0.10 to 0.21 ± 0.18 nm. Also, it is evident that except Arg106, Ser339, and Phe 504, rest of the interacting amino acid exhibited the RMSF value lower than 0.21 nm (which is highest average RMSF value amongst all the simulation system) (Supplementary Table S7). This means that the interacting binding residues are quite stable during the MD simulation. Moreover, the lower average RMSF value for the analogs A1, A2, A3, A6 and A7 than the protein demonstrate that the analogs are capable of forming stable interactions with the protein during MD simulation. This RMSF study suggested the ability of these four AGP analogs towards the formation of stable interaction with AF-COX-2 protein during the period of simulation.
The SAS area for each residue during the simulation was also computed in order to calculate the area of each residue that is accessible for solvent (Fig. 5c) 81 . The SAS value per residue was found to be in the range of 0-2.49 nm 2 . It is evident that a few residues of the AF-COX-2 protein are not accessible for solvents such as Asn181 in rofecoxib and A2 complex, and Lys237 in A7 complex. Further, amongst all the studied analogs and control compounds, the average value of SAS area per residue was found to be highest for A3. Further, the maximum value of SAS area per residue (nm 2 ) was found to be in the order of A4 (2.49) > A5 (2.44) > A3 (2.27) > A6 (2.22) > Rofecoxib (2.21) > A7 (2.19) > A2 (2.18) > A1 (2.11) > AGP (2.06) > aspirin (2.06) > protein (2.05) which signifies that analogs A3-A6 have the better accessibility for the solvent than the other complexes as well as native protein (Supplementary Table S8).
The Rg analysis was carried out to calculate the level of compactness in terms of protein folding and unfolding of the AF-COX-2 protein structure in the presence and absence of the ligands. The average Rg value for analog A3 and A6 was found to be equivalent to the Rg value of native protein (neutral control) but lesser than the positive control. This suggests a more stable and compact protein-ligand complex formation compared to other analogs and positive controls (Fig. 5d and Supplementary Table S8) 82 .
The Number of H-bonds in protein as well as between protein and ligand play a significant role in the stability of complexes, therefore, H-bond analysis was performed for native AF-COX-2 and its complexes with AGP, hit analogs, aspirin, and rofecoxib. It is evident from Fig. 6a that the analog A3 complex exhibited a maximum number of H bonds in protein, whereas, AGP and A3 complexes exhibited the maximum number of protein-ligand H-bonds as compared to other analogs and positive (Fig. 6b). This suggests the stronger and more efficient binding of the analog A3 with AF-COX-2 protein which is in good agreement with the results obtained from RMSD and RMSF analysis (Supplementary Table S6).
SASA, the area of a complex surface that is accessible to a water solvent, is used to predict the level of conformational variations that occur through the interaction between proteins and ligands. Figure 6c shows the SASA plot for all the hit analogs, AGP, neutral, and positive control. The average SASA value for all the hit AGP analogs along with AGP was found to be more than the neutral control (native protein SASA = 273.57 nm 2 ). Whereas, as compared to the positive control (rofecoxib and aspirin complex SASA = 273. 13  www.nature.com/scientificreports/ (V sas ), the volume enclosed by the center of a solvent probe rolling around the protein, was also calculated for all the studied ligands which signifies the stability of the complex system 83 . Typically, it measures the effect of forces on the protein surfaces exerted by solvents followed by the protein-solvent interactions. Also, V sas is an alternative to refine the SASA term, in order to include the influence of the solvents' effect on the protein's interior and also define the interaction between the protein and solvent 84 . It is an accurate and fast application to examine geometric volumes of the proteins. In addition, it can be used to calculate the volume that changes due to the interaction of protein-ligand complex with solvent 85 . V sas along with SASA provides a better acquiescence of implicit and explicit nonpolar solvent forces on the protein and its effect on the protein folding 86 . As shown in Fig. 6d and Supplementary Table S9, AGP, A3, A4, and A5 analog complex showed a greater value of V sas compared to all controls, while the highest value was observed for the A3 analog complex (119.24 nm\s3\n). These results indicate that the A3 analog forms the most stable complex with AF-COX-2 protein compared to all other analogs and controls. Furthermore, the SAS density, which is inversely proportional to the SASA, was determined to calculate the neighborhood density of burial HYD amino acids within the protein core that leads to protein folding 87 . The neighborhood density calculates the precise molecular and atomic quantities with coordinates for protein surface area accessible for solvent. It not only measures the hydrophobic effect of the amino acids density but also captures the electrostatic effect of the solvent on the protein folding and stability 88 . As demonstrated in Fig. 6e and Supplementary Table S9, the A3 analog possessed the least value of SAS density indicating better protein folding compared to other hit analogs and all controls after interaction with AF-COX-2 protein.
All three solvent accessible surface analysis (area, volume, and density) together provide more emphasis on the MD simulation analysis related to the analogs' complex stability in terms of protein folding and revealed more stability of analog A3 than the other studied compounds including the reference molecules. Overall, from MD simulation analysis, it can be inferred that A3 analog exhibit better protein stability than the other hit analogs, AGP, native protein, aspirin, and rofecoxib. Moreover, the molecular interactions of the hit analogs were also analyzed after the MD simulation (Supplementary Fig. S3 and Table S10). Interestingly, except A7, the binding site for all the hit analogs were observed www.nature.com/scientificreports/ to be consistent before and after the simulation. Further, amino acid residues namely Val102, Arg106, Val335, Leu338, Ser339, Tyr341, Leu345, Phe504, Val509, Ala513, and Ser516 were found to be important residues before and after simulation as these are consistent before and after simulation. Further, a comparative analysis of before and after MD simulation in aspirin protein revealed that after simulation two hydrophobic interaction (Val509, Ala513) and two H-bonds (Arg106 and Arg499) are formed which were three and one respectively before the simulation. In case of rofecoxib, only three hydrophobic interactions (Leu338, Tyr371, and Trp373) were observed after the simulation, whereas before simulation one H-bond (residue) and six hydrophobic interactions (residue) were noted. Next, AGP protein complex exhibited six hydrophobic interactions without H-bond after simulation whereas three H-bond with six hydrophobic interaction were observed before simulation. Then, analog A1 protein complex showed two H-bonds with seven hydrophobic interactions after simulation while three H-bonds with five hydrophobic interactions were found before simulation. Analog A2 protein complex revealed nine hydrophobic interactions without any H-bonds after simulation whereas four H-bonds with six hydrophobic interactions were observed before simulation. Analog A3 protein complex displayed one H-bond with eight hydrophobic interactions after simulation while seven hydrophobic interactions along with one H-bond were observed before simulation. A4 protein complex exhibited one H-bond and seven hydrophobic interaction after simulation while three H-bonds with six hydrophobic interactions were observed before simulation. A5 protein complex revealed 2 H-bonds and seven hydrophobic interaction after simulation whereas 4 H-bonds with five hydrophobic interactions before simulation. A6 protein complex showed one H-bond with four hydrophobic interactions after simulation while only hydrophobic interactions were observed before simulation. Lastly, A7 protein complex revealed one H-bond with three hydrophobic interaction and only hydrophobic interactions were majorly involved before simulation (Supplementary Table S10). Overall, from the comparative analysis, it was observed that the hit analogs except the A7 remain close to the binding site identified by the molecular docking analysis which indicates the efficient binding stability of the complex of hit analogs with AF-COX-2. Also, analog A1, A3 and A5 complexes maintained their consistencies before and after simulation.
Electrostatic potential energy simulation of the peptide-ligand complex. The EPE maps describe the distribution of electrons on the molecular surface, HOMO, and LUMO energy of the molecules. In this study, we have calculated the EPE, HOMO, and LUMO of the selected sequence of AF-COX-2 protein and its adduct with the AGP, hit AGP analogs, native protein, aspirin, and rofecoxib. Then, the binding energy of the adduct was computed, where the more negative value of binding energy indicates the more stability of the protein-ligand complex 89 . For this study, aspirin, rofecoxib, AGP, and A1-A7 AGP analogs were selected as ligands, while the protein sequences from 511 to 520 amino acids of AF-COX-2, i.e. Val511, Gly512, Ala513, Pro514, Phe515, Ser516, Leu517, Lys518, Gly519, Leu520 were selected as a peptide for docking simulation study by calculating EPE 35 . Later, the calculated binding energy values of the adducts were found to be in the range of − 53.81 kcal/mol (A3) to − 34.53 kcal/mol (aspirin) ( Table 5). Further, all the AGP and all seven analogs exhibited lesser binding energy than the rofecoxib and aspirin, while the least binding energy value was observed for A3 suggesting the most stable protein-ligand complex. www.nature.com/scientificreports/ MM/GBSA rescoring analysis. The binding free energy of the interacting hit AGP analogs, AGP, aspirin, and rofecoxib to the AF-COX-2 protein was computed by MM/GBSA method for before and after simulation analysis. The results revealed that there is a subtle difference between the MM/GBSA score for before and after simulation analysis (Fig. 7). Also, It can be noted from Fig. 7 that amongst all the investigated compounds, the A3 analog showed the most negative binding free energy (before simulation = − 55.37 kcal/mol and after simulation = − 56.25 kcal/mol) in terms of MM/GBSA indicating its strong interaction with AF-COX-2 protein. The MM/GBSA score was found to be in the decreasing order of binding affinity A3 > A1 > A5 > Rofecoxib > A2 > A4 > A6 > A7 > AGP > aspirin. Further, it was observed that all the hit AGP analogs along with the AGP exhibited a more negative value of free energy compared to aspirin, whereas analogs A3, A1, and A5 showed the binding affinity better than Rofecoxib, aspirin, AGP, and other analogs. Moreover, analog A3, A1 and A5 complex with protein were observed to be more stable after the simulation analysis and also implies that the analog A3 showed the better binding towards the COX-2 protein as compared to the reference compounds and other studied analogs. The MM/GBSA results also corroborated the findings of docking, Table 5. HOMO, LUMO, HLG, and electrostatic potential energy (EPE) of selected chemical compounds, AF-COX-2 peptide, and the adducts of the compounds with the AF-COX-2 peptide, as well as the binding energy for the complexes. HOMO, highest occupied molecular orbital; LUMO, lowest unoccupied molecular orbital; HLG, HOMO-LUMO Gap; EPE, electrostatic potential energy, and BE, binding energy.  www.nature.com/scientificreports/ MD simulation, and DFT analysis, and establish that the A3 analog would be a promising compound to inhibit the AF-COX-2 activity and can be used as a promising natural anti-inflammatory drug. Overall, our results reveals that andrographolide analog A3 is the most potent and effective plant based natural drug molecule against COX-2 protein. The efficacy of this analog was compared with other natural compounds as well as synthetic compounds for COX-2 inhibition. The binding energy for A3 from docking analysis was found to be − 8.56 kcal/mol which is better than virgin coconut oil derivatives (BE range for the derivatives: − 5.65 kcal/mol to − 7.58 kcal/mol), alliin (− 4.90 kcal/mol), pinoresinolm (− 8.38 kcal/mol), and syringaresinol (− 8.23 kcal/mol) [90][91][92][93] . Further, MD simulation as well as EPE docking simulation analysis were carried for the complex stability for A3. Moreover, the detailed comparative analysis results with the reference compounds supports that the identified analog A3 would be an efficient drug-like candidate against COX-2 protein. To the best of our knowledge, to date, the andrographolide analog A3 has not been explored for anti-inflammatory activity by inhibiting the COX-2. Hence, present work provides a foundation for the development of andrographolide analog A3 as a potential anti-inflammatory drug-like candidate acting via COX-2 inhibition. Though, we have employed a myriad of cheminformatic tools for identification and validation of analog A3 as a potential COX-2 inhibitor, further experimental authentication is required from the scientific community to expedite the drug development process.

Conclusion
In the present study, a promising plant-based anti-inflammatory compound against COX-2 was identified by employing a combined computational approach including molecular and quantum docking simulation studies. A validated full sequenced human AF-COX-2 protein structure was used for multiple sequence alignment to find out the active site residues for anti-inflammatory activity. The virtual screening of 237 AGP analogs against AF-COX-2 protein followed by interactive docking studies screened out 7 hit AGP analogs which showed their drug-like candidacy also from ADMET prediction analysis. Further, ligand efficiency metrics, DFT descriptors, i.e. HOMO, LUMO, HLG, and EPE established their good chemical reactivity. In addition, MD simulation, EPE docking simulation, and MM/GBSA study revealed that A3 analog (3-[2-[(1R,4aR,5R,6R,8aR)-6-hydroxy-5,6,8atrimethyl-2-methylidene-3,4,4a,5,7,8-hexahydro-1H-naphthalen-1-yl]ethylidene]-4-hydroxyoxolan-2-one) forms the most stable complex with the AF-COX-2 and would be a promising natural anti-inflammatory agent. In near future, additional experimental validations are required to substantiate these findings.

Data availability
All data generated or analyzed during this study are included in this article and its Supplementary Information files. Additional raw data or datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request. www.nature.com/scientificreports/